Márcio Luís Acencio scite author profile

The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings, and, to maintain consistency with other ontologies. As a result, 20 000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.

show abstract

The genome sequence of the plant pathogen Xylella fastidiosa

Simpson

Reinach

Arruda

et al. 2000

Nature

815

603

View full text Add to dashboard Cite

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis--a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium-bacterium and bacterium-host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer.

show abstract

HTRIdb: an open-access database for experimentally verified human transcriptional regulation interactions

2012

View full text Add to dashboard Cite

BackgroundThe modeling of interactions among transcription factors (TFs) and their respective target genes (TGs) into transcriptional regulatory networks is important for the complete understanding of regulation of biological processes. In the case of experimentally verified human TF-TG interactions, there is no database at present that explicitly provides such information even though many databases containing human TF-TG interaction data have been available. In an effort to provide researchers with a repository of experimentally verified human TF-TG interactions from which such interactions can be directly extracted, we present here the Human Transcriptional Regulation Interactions database (HTRIdb).DescriptionThe HTRIdb is an open-access database that can be searched via a user-friendly web interface and the retrieved TF-TG interactions data and the associated protein-protein interactions can be downloaded or interactively visualized as a network through the web version of the popular Cytoscape visualization tool, the Cytoscape Web. Moreover, users can improve the database quality by uploading their own interactions and indicating inconsistencies in the data. So far, HTRIdb has been populated with 284 TFs that regulate 18302 genes, totaling 51871 TF-TG interactions. HTRIdb is freely available at http://www.lbbc.ibb.unesp.br/htri.ConclusionsHTRIdb is a powerful user-friendly tool from which human experimentally validated TF-TG interactions can be easily extracted and used to construct transcriptional regulation interaction networks enabling researchers to decipher the regulation of biological processes.

show abstract

Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

2009

View full text Add to dashboard Cite

Background: The identification of essential genes is important for the understanding of the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential genes discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present here a machine learningbased computational approach relying on network topological features, cellular localization and biological process information for prediction of essential genes.

show abstract

Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

2016

View full text Add to dashboard Cite

Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genomes data and high-throughput experimental data, many computational methods for identifying essential proteins are proposed, which are useful complements to experimental methods. In this review, we show the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research.

show abstract

A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data

2010

View full text Add to dashboard Cite

BackgroundThe genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach which could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the pace of discovery of causal relationships between genes and diseases as well as the determination of druggability of gene products.ResultsIn this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was than used to assign morbidity and druggability scores to genes not known to be morbid and druggable and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability and, among the rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors to morbidity and druggability, respectively.ConclusionsWe were able to demonstrate that network topological features along with tissue expression profile and subcellular localization can reliably predict human morbid and druggable genes on a genome-wide scale. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.

show abstract

COVID19 Disease Map, a computational knowledge repository of virus–host interaction mechanisms

Ostaszewski

Niarakis

Mazein

et al. 2021

Molecular Systems Biology

View full text Add to dashboard Cite

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.

show abstract

HTRIdb: an open-access database for experimentally verified human transcriptional regulation interactions

2012

View full text Add to dashboard Cite

Background: The modeling of interactions among transcription factors (TFs) and their respective target genes (TGs) into transcriptional regulatory networks is important for the complete understanding of regulation of biological processes. In the case of human TF-TG interactions, there is no database at present that explicitly provides such information even though many databases containing human TF-TG interaction data have been available. In an effort to provide researchers with a repository of TF-TG interactions from which such interactions can be directly extracted, we present here the Human Transcriptional Regulation Interactions database (HTRIdb).Description: The HTRIdb is an open-access database of experimentally validated interactions among human TFs and their TGs. HTRIdb can be searched via a user-friendly web interface and the retrieved TF-TG interactions data and the associated protein-protein interactions can be downloaded or interactively visualized as a network using the Cytoscape Web software. Moreover, users can improve the database quality by uploading their own interactions and indicating inconsistencies in the data. So far, HTRIdb has been populated with 283 TFs that regulate 11886 genes, totaling 18160 TF-TG interactions. HTRIdb is freely available at http://www.lbbc.ibb.unesp.br/htri. Conclusions: HTRIdb is a powerful user-friendly tool from which human experimentally validated TF-TG interactions can be easily extracted and used to construct transcriptional regulation interaction networks enabling researchers to decipher the regulation of biological processes.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.