Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
† People involved in the organization of the challenge. ‡ People contributing data from their institutions. § Equal senior authors.
PURPOSE Variation in risk of adverse clinical outcomes in patients with cancer and COVID-19 has been reported from relatively small cohorts. The NCATS’ National COVID Cohort Collaborative (N3C) is a centralized data resource representing the largest multicenter cohort of COVID-19 cases and controls nationwide. We aimed to construct and characterize the cancer cohort within N3C and identify risk factors for all-cause mortality from COVID-19. METHODS We used 4,382,085 patients from 50 US medical centers to construct a cohort of patients with cancer. We restricted analyses to adults ≥ 18 years old with a COVID-19–positive or COVID-19–negative diagnosis between January 1, 2020, and March 25, 2021. We followed N3C selection of an index encounter per patient for analyses. All analyses were performed in the N3C Data Enclave Palantir platform. RESULTS A total of 398,579 adult patients with cancer were identified from the N3C cohort; 63,413 (15.9%) were COVID-19–positive. The most commonly represented cancers were skin (13.8%), breast (13.7%), prostate (10.6%), hematologic (10.5%), and GI cancers (10%). COVID-19 positivity was significantly associated with increased risk of all-cause mortality (hazard ratio, 1.20; 95% CI, 1.15 to 1.24). Among COVID-19–positive patients, age ≥ 65 years, male gender, Southern or Western US residence, an adjusted Charlson Comorbidity Index score ≥ 4, hematologic malignancy, multitumor sites, and recent cytotoxic therapy were associated with increased risk of all-cause mortality. Patients who received recent immunotherapies or targeted therapies did not have higher risk of overall mortality. CONCLUSION Using N3C, we assembled the largest nationally representative cohort of patients with cancer and COVID-19 to date. We identified demographic and clinical factors associated with increased all-cause mortality in patients with cancer. 
Full characterization of the cohort will provide further insights into the effects of COVID-19 on cancer outcomes and the ability to continue specific cancer treatments.
PURPOSE To provide real-world evidence on risks and outcomes of breakthrough COVID-19 infections in vaccinated patients with cancer using the largest national cohort of COVID-19 cases and controls. METHODS We used the National COVID Cohort Collaborative (N3C) to identify breakthrough infections between December 1, 2020, and May 31, 2021. We included patients partially or fully vaccinated with mRNA COVID-19 vaccines with no prior SARS-CoV-2 infection record. Risks for breakthrough infection and severe outcomes were analyzed using logistic regression. RESULTS A total of 6,860 breakthrough cases were identified within the N3C-vaccinated population, among whom 1,460 (21.3%) were patients with cancer. Solid tumors and hematologic malignancies had significantly higher risks for breakthrough infection (odds ratios [ORs] = 1.12, 95% CI, 1.01 to 1.23 and 4.64, 95% CI, 3.98 to 5.38) and severe outcomes (ORs = 1.33, 95% CI, 1.09 to 1.62 and 1.45, 95% CI, 1.08 to 1.95) compared with noncancer patients, adjusting for age, sex, race/ethnicity, smoking status, vaccine type, and vaccination date. Compared with solid tumors, hematologic malignancies were at increased risk for breakthrough infections (adjusted OR ranged from 2.07 for lymphoma to 7.25 for lymphoid leukemia). Breakthrough risk was reduced after the second vaccine dose for all cancers (OR = 0.04; 95% CI, 0.04 to 0.05), and for Moderna's mRNA-1273 compared with Pfizer's BNT162b2 vaccine (OR = 0.66; 95% CI, 0.62 to 0.70), particularly in patients with multiple myeloma (OR = 0.35; 95% CI, 0.15 to 0.72). Medications with major immunosuppressive effects and bone marrow transplantation were strongly associated with breakthrough risk among the vaccinated population. CONCLUSION Real-world evidence shows that patients with cancer, especially hematologic malignancies, are at higher risk for developing breakthrough infections and severe outcomes. 
Patients with vaccination were at markedly decreased risk for breakthrough infections. Further work is needed to assess boosters and new SARS-CoV-2 variants.
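The odds ratios above come from logistic regression. As a minimal sketch of how such a figure is conventionally derived, the snippet below converts a fitted coefficient and its standard error into an odds ratio with a Wald-type 95% confidence interval. The abstract does not state how the intervals were computed, so the Wald form is an assumption, and the function name `odds_ratio_with_ci` and the standard error are hypothetical, illustrative values rather than output from the study's models.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Turn a logistic-regression coefficient into an odds ratio.

    beta: fitted log-odds coefficient for the exposure of interest
    se:   standard error of that coefficient
    z:    normal quantile (1.96 gives an approximate 95% interval)
    Returns (odds_ratio, lower_bound, upper_bound) using a Wald interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative inputs only: beta is chosen so the point estimate is 0.66,
# and the standard error is a hypothetical value, not a study result.
beta = math.log(0.66)
se = 0.03
or_, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR = {or_:.2f}; 95% CI, {lower:.2f} to {upper:.2f}")
```

An odds ratio below 1 with an interval that excludes 1 corresponds to a reduced risk, which is how the vaccine-type comparison in the abstract is read.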
The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Here we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility (P. aeruginosa only). We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. We conclude that, while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. We finally report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bioontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Predicting GO terms for a protein (protein-centric) and predicting which proteins are associated with a given function (term-centric) are related but different computational problems: the former is a multi-label classification problem with a structured output, while the latter is a binary classification task. Predicting the results of a genome-wide screen for a single or a small number of functions fits the term-centric formulation. To see how well all participating CAFA methods perform term-centric predictions, we mapped results from the protein-centric CAFA3 methods onto these terms. In addition, we held a separate CAFA challenge, CAFA-π, whose purpose was to attract additional submissions from algorithms that specialize in term-centric tasks. We performed screens for three functions in three species, which we then used to assess protein function prediction. In the bacterium Pseudomonas aeruginosa and the fungus Candida albicans we performed genome-wide screens capable of uncovering genes with two functions, biofilm formation (GO:0042710) and motility (for P. aeruginosa only) (GO:0001539), as described in Methods. In Drosophila melanogaster we performed targeted assays, guided by previous CAFA submissions, of a ...
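The distinction between protein-centric and term-centric prediction can be sketched with a toy score table. This is an illustration only: the protein names, scores, threshold, and the helper functions `protein_centric` and `term_centric` are hypothetical and do not come from any CAFA method, and the sketch ignores the ontology structure that makes the protein-centric output "structured". Only the two GO identifiers are taken from the text.

```python
# Hypothetical prediction scores: one row per protein, one column per GO term.
PROTEINS = ["P1", "P2", "P3", "P4"]
TERMS = ["GO:0042710", "GO:0001539"]  # biofilm formation, motility
SCORES = [
    [0.9, 0.2],
    [0.1, 0.7],
    [0.6, 0.6],
    [0.3, 0.1],
]


def protein_centric(protein, threshold=0.5):
    """Multi-label view: return every term predicted for one protein."""
    row = SCORES[PROTEINS.index(protein)]
    return [term for term, score in zip(TERMS, row) if score >= threshold]


def term_centric(term, threshold=0.5):
    """Binary view: for one term, label each protein as positive or negative."""
    column = TERMS.index(term)
    return {
        protein: row[column] >= threshold
        for protein, row in zip(PROTEINS, SCORES)
    }


print(protein_centric("P3"))
print(term_centric("GO:0042710"))
```

Reading the same table by row or by column mirrors how protein-centric submissions can be mapped onto a single term, as the passage describes for the genome-wide screens.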
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. 
Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-017-0509-y) contains supplementary material, which is available to authorized users.